Replication material for

	Citizen forecasts of Mexican presidential elections, 2000-2024
	
	Andreas Murr, CIDE & University of Warwick, andreas.murr@cide.edu & a.murr@warwick.ac.uk
	
----------------------------------------------------

1) R code

The three files

- study-1.r
- study-2.r
- study-3.r (calling hlm.r)

perform the analyses reported in the paper.

More specifically,

- study-1.r replicates Figure 1 and Tables 1 and 2
- study-2.r replicates Figure 2 and Table 3
- study-3.r replicates Tables 4-9

All figures are saved as PDF files:

- study-1.r creates
  - figure-1.pdf
- study-2.r creates
  - figure-2.pdf  

Tables and numerical examples are printed in the console.

For each script and its corresponding console output, see

- log-file-study-1.pdf
- log-file-study-2.pdf
- log-file-study-3.pdf
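
A minimal sketch of running the three scripts from an R session, assuming the working directory contains the scripts and the data files. The helper name run_if_present is hypothetical and not part of the replication material.

```r
# Hypothetical helper: source a script only if it exists in the working directory.
run_if_present <- function(script) {
  if (!file.exists(script)) {
    message("Skipping ", script, ": not found in ", getwd())
    return(invisible(FALSE))
  }
  # echo = TRUE prints each command and its output to the console
  source(script, echo = TRUE)
  invisible(TRUE)
}

scripts <- c("study-1.r", "study-2.r", "study-3.r")
ran <- vapply(scripts, run_if_present, logical(1))
```

The vector ran records which scripts were found and executed, which helps when only part of the archive has been downloaded.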

2) Data

Each R script loads data set(s):

- Script: study-1.r
- Data:
  - study-1-vote-intentions.rdata
  - study-1-win-forecasts.rdata

- Script: study-2.r
- Data: study-2-data.rdata
- Objects: 
  - aggregated vote share expectations per election
    - exp2000
    - exp2006
    - exp2012
    - exp2018
  - aggregated vote intentions per election
    - int2000 
    - int2006
    - int2012
    - int2018
  - actual vote shares
    - v
  - individual vote share expectations per survey
    (these correspond to the rows of exp20XX)
    - X00    (date: 18.06.2000)
    - X06.1  (date: 11.06.2006)
    - X06.2  (date: 18.06.2006)
    - X06.3  (date: 01.07.2006)
    - X06.4  (date: 02.07.2006)
    - X12    (date: 17.06.2012)
    - X18.1  (date: 27.05.2018)
    - X18.2  (date: 17.06.2018)
  - individual vote intentions per survey
    - X00    (date: 18.06.2000)
    - X06.1  (date: 11.06.2006)
    - X06.2  (date: 18.06.2006)
    - X06.3  (date: 01.07.2006)
    - X06.4  (date: 02.07.2006)
    - X12    (date: 17.06.2012)
    - X18.1  (date: 27.05.2018)
    - X18.2  (date: 17.06.2018)
  
- Script: study-3.r
- Data: study-3-data.rdata
- Objects:
  - sel00 (date: 18.06.2000)
  - sel06 (date: 18.06.2006)
  - sel12 (date: 17.06.2012)
  - sel18 (date: 17.06.2018)
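
To check which objects a data file actually contains, one option is to load it into a separate environment and list the names. This is a sketch only; the helper name list_rdata_objects is hypothetical.

```r
# Hypothetical helper: load an .rdata file into its own environment
# and return the names of the objects it contains.
list_rdata_objects <- function(path) {
  if (!file.exists(path)) return(character(0))
  env <- new.env()
  load(path, envir = env)   # objects are created in env, not in the workspace
  sort(ls(env))
}

objs <- list_rdata_objects("study-3-data.rdata")
print(objs)
```

Loading into a separate environment avoids overwriting objects of the same name that are already in the workspace.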
  
In the data sets loaded by study-3.r (sel00, sel06, sel12, sel18), the variables are:
  - err.XXX is the error of the forecast (actual - forecasted) where XXX is the party label
  - female is the gender (0 = male, 1 = female)
  - age (age100) is the age of the respondent in years (in units of 100 years)
  - educ is the level of education (1 = no formal education / primary; 2 = secondary or vocational or equivalent; 3 = high school or equivalent; 4 = college or more)
  - interest is interest in politics (0 = none; 1 = a little; 2 = a lot)
  - pid.XXX scores 1 if the respondent identifies with the party and 0 otherwise, where XXX is the party label
  - time is the number of years that the state in which the respondent lives had a non-priísta governor
  - clean2 is the sum of whether the elections are perceived to be clean or fraudulent and whether IFE/INE is believed to guarantee impartial elections, where for each variable 0 = fraudulent / no; .5 = don’t know; 1 = clean / yes
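
To make the coding of two of these variables concrete, the sketch below recomputes a forecast error and the clean2 index for made-up respondents. All numbers, and the names actual, forecast, clean.elections and trust.ife, are hypothetical and not taken from the data.

```r
# Forecast error as defined above: actual minus forecasted (made-up values).
actual   <- 53.2                  # hypothetical actual vote share of one party
forecast <- c(45, 55, 60)         # hypothetical forecasts by three respondents
err      <- actual - forecast     # positive = respondent underestimated the party

# clean2: sum of two items, each coded 0 (fraudulent / no),
# 0.5 (don't know) or 1 (clean / yes), so it ranges from 0 to 2.
clean.elections <- c(0, 0.5, 1)
trust.ife       <- c(1, 0.5, 1)
clean2          <- clean.elections + trust.ife

print(data.frame(err, clean2))
```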

3) Functions and dependencies

The file study-2.r defines the functions:

- alpha.post.dirichlet() to perform the posterior analysis of the model of vote intentions (Dirichlet-Multinomial), and
- mupost.mvnorm() to perform the posterior analysis of the model of vote share expectations (Inverse-Wishart-Normal).

It depends on the MASS and MCMCpack libraries.
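
For readers unfamiliar with the Dirichlet-Multinomial model named above, here is a minimal sketch of the standard conjugate update. It is not the code of alpha.post.dirichlet(); the function names dirichlet_posterior and rdirichlet_base, the uniform prior, and the counts are assumptions for illustration, and the draws use base R rather than MCMCpack.

```r
# Standard conjugate update: a Dirichlet(alpha) prior combined with
# multinomial counts gives a Dirichlet(alpha + counts) posterior.
dirichlet_posterior <- function(counts, alpha = rep(1, length(counts))) {
  alpha + counts
}

# Draw from a Dirichlet by normalising independent gamma variates (base R only).
rdirichlet_base <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  g / rowSums(g)
}

set.seed(1)
counts     <- c(A = 420, B = 380, C = 200)   # hypothetical vote intention counts
alpha_post <- dirichlet_posterior(counts)
draws      <- rdirichlet_base(5000, alpha_post)
post_mean  <- alpha_post / sum(alpha_post)   # analytical posterior mean of the shares

print(round(post_mean, 3))
print(round(colMeans(draws), 3))             # Monte Carlo estimate of the same means
```

Each row of draws is one simulated vector of vote shares that sums to one, so quantities such as the probability that party A leads can be estimated as the share of rows in which column 1 is largest.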

The file study-3.r defines functions to compute the posterior correlations:

- invz() computes the inverse of the z transform
- cor.pos() computes the posterior of bivariate correlations [relies on invz()]
- cor.pos.mul() computes the posterior of multivariate correlations [relies on cor.pos()]
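
If the z transform here is Fisher's z (an assumption; this file does not say), its inverse is the hyperbolic tangent, and an approximate posterior for a correlation can be simulated on the z scale. The sketch below illustrates that idea with hypothetical function names and inputs; it is not the code of invz() or cor.pos().

```r
fisher_z     <- function(r) atanh(r)   # z = 0.5 * log((1 + r) / (1 - r))
inv_fisher_z <- function(z) tanh(z)    # maps z back to the (-1, 1) correlation scale

# Approximate posterior draws of a correlation: normal on the z scale with
# mean atanh(r) and variance 1 / (n - 3), then transformed back.
cor_posterior_draws <- function(r, n, ndraws = 5000) {
  inv_fisher_z(rnorm(ndraws, mean = fisher_z(r), sd = 1 / sqrt(n - 3)))
}

set.seed(1)
draws <- cor_posterior_draws(r = 0.3, n = 200)   # hypothetical correlation and sample size
print(quantile(draws, c(0.025, 0.5, 0.975)))
```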

The file study-3.r calls hlm.r.

The file hlm.r implements the algorithm of Cepeda & Gamerman (2000) to fit the Bayesian heteroskedastic linear model. It defines the functions

- hlm() to run the algorithm, and
- summary.hlm() to summarize the posterior.

It relies on the MASS library.
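
As a reminder of the model class, a heteroskedastic linear model lets both the mean and the variance of the outcome depend on covariates. The sketch below simulates data from one such model, assuming a log link for the variance. The variable names and parameter values are hypothetical, and the code does not reproduce hlm() or its sampler.

```r
set.seed(1)
n <- 2000
x <- rnorm(n)                 # covariate for the mean
z <- rbinom(n, 1, 0.5)        # covariate for the variance
beta  <- c(1, 2)              # mean:         1 + 2 * x
gamma <- c(0, 1)              # log variance: 0 + 1 * z

mu     <- beta[1] + beta[2] * x
sigma2 <- exp(gamma[1] + gamma[2] * z)
y      <- rnorm(n, mean = mu, sd = sqrt(sigma2))

# Least squares still estimates the mean coefficients, while the residual
# variance differs by group: in theory exp(0) = 1 versus exp(1), about 2.72.
fit     <- lm(y ~ x)
res_var <- tapply(resid(fit)^2, z, mean)
print(coef(fit))
print(res_var)
```

A model with a single constant variance would average over the two groups, which is why the variance equation is modelled explicitly in this class of models.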

----------------------------------------------------